Scikit-learn's batch logistic regression doesn't support a binomial response -- where your data is (successes, trials) rather than (success or failure). This can be a drag when you have a lot of trials with the exact same features, as your design matrix will be repetitive and much larger than necessary.
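To make that concrete, here's a minimal sketch of the blow-up (the feature rows and counts are made up, and happen to match the toy example at the end of this post):
import numpy as np

# compressed binomial form: one row per unique feature vector
X = np.array([[1, 0], [0, 1]])
n = np.array([4, 4])  # trials per row
k = np.array([3, 1])  # successes per row

# expanded Bernoulli form that scikit-learn expects: one row per trial
X_long = np.repeat(X, n, axis=0)  # 8 rows instead of 2
y_long = np.concatenate([[1] * int(ki) + [0] * int(ni - ki)
                         for ni, ki in zip(n, k)])
With thousands of trials per unique feature vector, X_long gets big fast, while the compressed form stays the same size.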
It isn't too bad to do it yourself though, which I'll show here. But first, my notation:
I'll write $X$ for the design matrix (one row per unique feature vector), $n$ and $k$ for the vectors of trial and success counts, $\theta$ and $b$ for the coefficients and intercept, and $\lambda$ for the L2 regularization strength (lambd in the code, because lambda is reserved in Python). With $y = \sigma(X\theta + b)$ for the vector of predicted success probabilities, the objective the code minimizes is
$$J(b, \theta) = \frac{1}{\sum_i n_i} \sum_i \left[ -k_i \log y_i - (n_i - k_i) \log(1 - y_i) \right] + \frac{\lambda}{2 \sum_i n_i} \lVert \theta \rVert^2.$$
In the code, I treat the intercept separately and don't regularize it, though it's pretty straightforward to modify things and include it as a normal feature.
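fmin_l_bfgs_b wants the gradient alongside the objective, so it's worth writing down; this is just unpacking what the code below computes, using $\sigma'(z) = \sigma(z)(1 - \sigma(z))$ and writing $N = \sum_i n_i$:
$$\frac{\partial J}{\partial b} = \frac{1}{N} \sum_i (n_i y_i - k_i), \qquad \nabla_\theta J = \frac{1}{N} \left( X^\top (n \odot y - k) + \lambda \theta \right),$$
where $\odot$ is elementwise multiplication. The residual $n \odot y - k$ is the r in lbfgs_func.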
In [1]:
from scipy.optimize import fmin_l_bfgs_b
from numpy import log, exp, hstack

def logistic(x):
    return 1/(1 + exp(-x))

def log_loss(y_orig, n, k, eps=1e-15):
    # clip predictions away from 0 and 1 so the logs stay finite
    y = y_orig.copy()
    y[y < eps] = eps
    y[y > 1 - eps] = 1 - eps
    # binomial negative log-likelihood, normalized by total trial count
    x = -k * log(y) - (n - k) * log(1 - y)
    return x.sum()/n.sum()

def lbfgs_func(x, X, n, k, lambd=1.0):
    # first entry of x is the (unregularized) intercept; the rest are coefficients
    intercept, theta = x[0], x[1:]
    y = logistic(X.dot(theta) + intercept)
    penalty = lambd/2*(theta*theta).sum()/n.sum()
    obj = log_loss(y, n, k) + penalty
    # gradient: r_i = n_i*y_i - k_i is the residual in expected-success space
    r = n*y - k
    grad_intercept = r.sum()/n.sum()
    grad_coefs = (X.T.dot(r) + lambd*theta)/n.sum()
    grad = hstack([grad_intercept, grad_coefs])
    return obj, grad
To use the code, you need your design matrix $X$ and a vector each of trials and successes, $n$ and $k$. It'll look something like this:
x0 = np.zeros(X.shape[1] + 1)  # initial guess: intercept and coefficients all zero
x_optimal, objective, info = fmin_l_bfgs_b(lbfgs_func, x0, args=(X, n, k))
intercept, theta = x_optimal[0], x_optimal[1:]
Now intercept and theta will have your optimal parameters. To make predictions on new data, do something like this:
y_pred = logistic(X_new.dot(theta) + intercept)
Note that for large problems you might want to play with the defaults for fmin_l_bfgs_b; in particular I've found changing the factr keyword argument to be helpful.
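For example (the 1e12 here is just an illustration; the default is factr=1e7, and smaller values mean a more precise but slower solve):
x_optimal, objective, info = fmin_l_bfgs_b(
    lbfgs_func, x0, args=(X, n, k),
    factr=1e12)  # looser convergence tolerance; factr is scaled by machine epsilon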
As a sanity check, let's fit the same tiny problem both ways: scikit-learn on the expanded one-row-per-trial data, and the code above on the compressed (successes, trials) form, with $\lambda = 1/C$ so the penalties match.
In [2]:
import numpy as np
from sklearn import linear_model
In [3]:
X = np.array([[1, 0]] * 4 + [[0, 1]] * 4)  # one row per trial
y = np.array([1, 1, 1, 0, 1, 0, 0, 0])
C = 10
clf = linear_model.LogisticRegression(C=C)
clf.fit(X, y)
clf.intercept_, clf.coef_
Out[3]:
In [4]:
X = np.array([[1, 0], [0, 1]])  # unique feature rows
n = np.array([4, 4])            # trials per row
k = np.array([3, 1])            # successes per row
lambd = 1/C                     # match scikit-learn's regularization strength
x0 = np.zeros(X.shape[1]+1)
x_optimal, *_ = fmin_l_bfgs_b(lbfgs_func, x0, args=(X, n, k, lambd))
intercept, theta = x_optimal[0], x_optimal[1:]
intercept, theta
Out[4]:
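The two fits should agree to optimizer tolerance. A quick check, reusing clf, intercept, and theta from the cells above (the atol is arbitrary):
print(np.allclose(clf.intercept_[0], intercept, atol=1e-3),
      np.allclose(clf.coef_.ravel(), theta, atol=1e-3))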